Analysis of the algorithm: From kernels to backup genes.

Kernelization section

The algorithm transformed each semantic similarity matrix so that it could be used as a kernel. Once this was done for every network and kernel type, the resulting kernels were integrated by kernel type. The tables below summarize the properties of the matrices at each phase of the process.

Annotation properties

Table 1. Annotation file descriptors

Net Min Max Average Standard_Deviation
biological_process_sim 1 1053 8.255678448065316 20.730757162592834
cellular_component_sim 1 5172 7.219656420451649 82.56741928666138
disease_sim 1 81 1.7829472267615112 2.3512948371338793
gene_PS_sim 1 108 2.1519058295964126 5.045578888786698
gene_TF_sim 1 5778 3.0320060963993143 74.46796191948712
gene_hgncGroup_sim 1 2294 2.261042369703787 23.27548667222241
genetic_interaction_effect_bicor_sim 1081.0 1086.0 1085.8046559870922 0.9687934667201032
molecular_function_sim 1 6791 4.87167748867193 54.08093332133869
pathway_sim 1 479 6.75436035343834 16.179001796162975
phenotype_sim 1 1307 23.941396322320227 48.71098651648341
protein_interaction_sim 1 7338 610.0699285559645 500.6662722348227

Matrix properties

Table 2. Similarity matrices

Net Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero Matrix_Non_Zero_Density
biological_process_sim 16979x16979 288286441 256362631 0.8892635744877089
cellular_component_sim 17963x17963 322669369 320867108 0.9944145271502359
disease_sim 4162x4162 17322244 17228280 0.9945755295907389
gene_PS_sim 3020x3020 9120400 1666 0.00018266742686724266
gene_TF_sim 10213x10213 104305369 172858 0.0016572301278182525
gene_hgncGroup_sim 25136x25136 631818496 14457654 0.022882606462980154
genetic_interaction_effect_bicor_sim 17354x17354 301161316 202089222 0.6710331349461894
molecular_function_sim 17333x17333 300432889 296572456 0.9871504314562578
pathway_sim 10965x10965 120231225 159182 0.0013239655505464575
phenotype_sim 5077x5077 25775929 25699341 0.9970287006920294
protein_interaction_sim 18476x18476 341362576 11271627 0.03301951588272523
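
The columns of Table 2 follow directly from the matrix shape and its count of non-zero entries, with the density being the non-zero count divided by the total number of elements. A minimal sketch, assuming each matrix is available as a dense NumPy array (the function name `matrix_descriptors` is hypothetical and not part of the original pipeline):

```python
import numpy as np

def matrix_descriptors(matrix):
    """Return the descriptors reported in the matrix tables for one matrix."""
    rows, cols = matrix.shape
    elements = rows * cols
    non_zero = int(np.count_nonzero(matrix))
    return {
        "Matrix_Dimensions": f"{rows}x{cols}",
        "Matrix_Elements": elements,
        "Matrix_Elements_Non_Zero": non_zero,
        # Density = fraction of entries that are not zero.
        "Matrix_Non_Zero_Density": non_zero / elements,
    }

# Toy 2x2 similarity matrix with two non-zero entries.
toy = np.array([[0.0, 0.5], [0.5, 0.0]])
descriptors = matrix_descriptors(toy)
```

For the gene_PS_sim row, the same ratio is 1666 / 9120400, which matches the density reported in the table.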

Table 3. Filtered similarity matrices

Table 4. Uncombined kernel matrices

Net Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero Matrix_Non_Zero_Density
biological_process ct 16979x16979 288286441 288286441 1.0
biological_process el 16979x16979 288286441 288286441 1.0
biological_process ka 16979x16979 288286441 256366715 0.8892777409534846
biological_process rf 16979x16979 288286441 288286441 1.0
cellular_component ct 17963x17963 322669369 322669369 1.0
cellular_component el 17963x17963 322669369 322669369 1.0
cellular_component ka 17963x17963 322669369 320870945 0.9944264185795708
cellular_component rf 17963x17963 322669369 322669369 1.0
disease ct 4162x4162 17322244 17322244 1.0
disease el 4162x4162 17322244 17322244 1.0
disease ka 4162x4162 17322244 17230892 0.9947263183684516
disease rf 4162x4162 17322244 17322244 1.0
gene_PS ct 3020x3020 9120400 2711416 0.29729134687075126
gene_PS el 3020x3020 9120400 10338 0.0011335029165387483
gene_PS ka 3020x3020 9120400 4686 0.0005137932546818122
gene_PS node2vec 3020x3020 9120400 9120400 1.0
gene_PS rf 3020x3020 9120400 10338 0.0011335029165387483
gene_TF ct 10213x10213 104305369 25826798 0.2476075608341887
gene_TF el 10213x10213 104305369 1072167 0.010279116121050298
gene_TF ka 10213x10213 104305369 183071 0.0017551445506127303
gene_TF node2vec 10213x10213 104305369 104305369 1.0
gene_TF rf 10213x10213 104305369 1072167 0.010279116121050298
gene_hgncGroup ct 25136x25136 631818496 631304330 0.9991862124909999
gene_hgncGroup el 25136x25136 631818496 326932198 0.5174463870079549
gene_hgncGroup ka 25136x25136 631818496 14482790 0.022922390040319426
gene_hgncGroup rf 25136x25136 631818496 326932198 0.5174463870079549
genetic_interaction_effect_bicor ct 17354x17354 301161316 301161316 1.0
genetic_interaction_effect_bicor el 17354x17354 301161316 301161316 1.0
genetic_interaction_effect_bicor ka 17354x17354 301161316 202106576 0.6710907585488171
genetic_interaction_effect_bicor rf 17354x17354 301161316 301161316 1.0
molecular_function ct 17333x17333 300432889 300432889 1.0
molecular_function el 17333x17333 300432889 300432889 1.0
molecular_function ka 17333x17333 300432889 296576663 0.9871644345835918
molecular_function rf 17333x17333 300432889 300432889 1.0
pathway ct 10965x10965 120231225 88805059 0.7386189319787767
pathway el 10965x10965 120231225 8648661 0.07193356800614815
pathway ka 10965x10965 120231225 170147 0.0014151648209522942
pathway node2vec 10965x10965 120231225 120231225 1.0
pathway rf 10965x10965 120231225 8648661 0.07193356800614815
phenotype ct 5077x5077 25775929 25775929 1.0
phenotype el 5077x5077 25775929 25775929 1.0
phenotype ka 5077x5077 25775929 25700323 0.997066798251966
phenotype rf 5077x5077 25775929 25775929 1.0
protein_interaction ct 18476x18476 341362576 341362576 1.0
protein_interaction el 18476x18476 341362576 341362576 1.0
protein_interaction ka 18476x18476 341362576 11290078 0.033073566916134355
protein_interaction node2vec 18476x18476 341362576 341362576 1.0
protein_interaction rf 18476x18476 341362576 341362576 1.0
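
The report does not define the kernel abbreviations, so the following is only a sketch under stated assumptions: that `ka` stands for a kernelized adjacency matrix (the similarity matrix with its diagonal shifted until it is positive semidefinite) and that `ct` stands for the commute-time kernel (the pseudoinverse of the graph Laplacian). Under those assumptions, the pattern in Table 4 is what one would expect: `ka` changes little more than the diagonal, whereas `ct` fills in entries for every pair of connected nodes. The function names are hypothetical.

```python
import numpy as np

def kernelized_adjacency(sim):
    """Assumed 'ka': shift the diagonal so all eigenvalues are >= 0."""
    smallest = np.linalg.eigvalsh(sim).min()
    shift = max(0.0, -smallest)
    return sim + shift * np.eye(sim.shape[0])

def commute_time(sim):
    """Assumed 'ct': Moore-Penrose pseudoinverse of the Laplacian D - A."""
    laplacian = np.diag(sim.sum(axis=1)) - sim
    return np.linalg.pinv(laplacian)

# Toy network: a path of three genes (0 - 1 - 2), stored as a symmetric matrix.
sim = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])

ka = kernelized_adjacency(sim)
ct = commute_time(sim)

# Off-diagonal zeros survive in ka but not in ct.
ka_non_zero = int(np.count_nonzero(np.abs(ka) > 1e-12))
ct_non_zero = int(np.count_nonzero(np.abs(ct) > 1e-12))
```

On this toy network, the pair of genes that are not directly linked keeps a zero entry in `ka` but receives a non-zero value in `ct`.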

Table 5. Integrated kernel matrices

Integration Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero Matrix_Non_Zero_Density
integration_mean_by_presence ct 30486x30486 929396196 803162983 0.8641771791801051
integration_mean_by_presence el 30486x30486 929396196 577524902 0.6213979619085939
integration_mean_by_presence ka 30486x30486 929396196 373326596 0.4016872434024897
integration_mean_by_presence node2vec 15658x15658 245172964 192376918 0.7846579608997997
integration_mean_by_presence rf 30486x30486 929396196 577524902 0.6213979619085939
mean ct 30486x30486 929396196 803162285 0.8641764281548663
mean el 30486x30486 929396196 577524902 0.6213979619085939
mean ka 30486x30486 929396196 373326596 0.4016872434024897
mean node2vec 15658x15658 245172964 192376918 0.7846579608997997
mean rf 30486x30486 929396196 577524902 0.6213979619085939
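
The integrated matrices are larger than any single kernel in Table 4, which suggests that integration is done over the union of the genes of all networks. The sketch below illustrates one plausible reading of the two integration modes. It is an assumption, not the documented behavior of the pipeline: `mean` is taken to divide each summed entry by the total number of kernels, and `integration_mean_by_presence` by the number of kernels that contain both genes. The function name `integrate` and the `(matrix, gene_ids)` input format are hypothetical.

```python
import numpy as np

def integrate(kernels, by_presence=False):
    """Average kernels over the union of their genes.

    kernels: list of (matrix, gene_ids) pairs, one per network.
    """
    genes = sorted(set().union(*(ids for _, ids in kernels)))
    index = {gene: i for i, gene in enumerate(genes)}
    n = len(genes)
    total = np.zeros((n, n))
    counts = np.zeros((n, n))
    for matrix, ids in kernels:
        pos = np.array([index[gene] for gene in ids])
        total[np.ix_(pos, pos)] += matrix   # place kernel in the union space
        counts[np.ix_(pos, pos)] += 1       # how many kernels cover each pair
    if by_presence:
        divisor = np.where(counts > 0, counts, 1)  # avoid dividing by zero
    else:
        divisor = len(kernels)
    return total / divisor, genes

k1 = (np.array([[1.0, 2.0], [2.0, 1.0]]), ["a", "b"])
k2 = (np.array([[4.0, 6.0], [6.0, 4.0]]), ["b", "c"])

mean_matrix, genes = integrate([k1, k2])
presence_matrix, _ = integrate([k1, k2], by_presence=True)
```

Under this reading both modes produce non-zero values at the same positions and differ only in magnitude, which would be consistent with the nearly identical non-zero counts reported for the two integration types.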

Weight values